information projection
Beyond Adult and COMPAS: Fair Multi-Class Prediction via Information Projection
We consider the problem of producing fair probabilistic classifiers for multi-class classification tasks. We formulate this problem in terms of "projecting" a pre-trained (and potentially unfair) classifier onto the set of models that satisfy target group-fairness requirements. The new, projected model is given by post-processing the outputs of the pre-trained classifier by a multiplicative factor. We provide a parallelizable, iterative algorithm for computing the projected classifier and derive both sample complexity and convergence guarantees. Comprehensive numerical comparisons with state-of-the-art benchmarks demonstrate that our approach maintains competitive performance in terms of accuracy-fairness trade-off curves, while achieving favorable runtime on large datasets. We also evaluate our method at scale on an open dataset with multiple classes, multiple intersectional groups, and over 1M samples.
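The multiplicative post-processing described in the abstract can be sketched as an alternating update. The sketch below pushes each group's average predicted class rates toward a common target (a statistical-parity-style constraint); the function name, the per-(group, class) factor scheme, and the target-rate formulation are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def project_classifier(probs, groups, target_rates, n_iter=200):
    """Post-process classifier outputs with per-(group, class) multiplicative
    factors so each group's average predicted class rates approach
    `target_rates`. Illustrative sketch only.

    probs: (n, k) pre-trained class probabilities
    groups: (n,) integer group labels
    target_rates: (k,) desired average probability per class
    """
    factors = np.ones((groups.max() + 1, probs.shape[1]))
    for _ in range(n_iter):
        for g in np.unique(groups):
            mask = groups == g
            adj = probs[mask] * factors[g]
            adj /= adj.sum(axis=1, keepdims=True)
            # multiplicative update nudging group-g averages toward targets
            factors[g] *= target_rates / adj.mean(axis=0)
    adjusted = probs * factors[groups]
    return adjusted / adjusted.sum(axis=1, keepdims=True)
```

Because the factors act multiplicatively on the pre-trained probabilities, the post-processed model never needs to be retrained; only the small factor table is fit.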
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. Overview: The paper proposes a framework for enforcing structure in Bayesian models via structured prior selection based on the maximum entropy principle. Although the optimal prior may not be tractable, the authors develop an approximation method using submodular optimization. Constructing priors with structured variables is an important topic, so this method should be able to make a good impact. Quality: The paper is technically sound.
The Benefits of Balance: From Information Projections to Variance Reduction
Data balancing across multiple modalities and sources appears in various forms in foundation models in machine learning and AI, e.g., in CLIP and DINO. We show that data balancing across modalities and sources actually offers an unsuspected benefit: variance reduction. We present a non-asymptotic statistical bound that quantifies this variance reduction effect and relates it to the eigenvalue decay of Markov operators. Furthermore, we describe how various forms of data balancing in contrastive multimodal learning and self-supervised clustering can be better understood, and even improved upon, owing to our variance reduction viewpoint.
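One concrete instance of the data balancing discussed above is iterative proportional fitting (Sinkhorn scaling) of a joint source-by-cluster count matrix toward uniform marginals. The setup below is an illustrative sketch under that assumption, not the paper's pipeline.

```python
import numpy as np

def balance(counts, n_iter=100):
    """Sinkhorn-style iterative proportional fitting: rescale a joint
    (source x cluster) count matrix so both marginals become uniform.
    Illustrative sketch of one simple data-balancing operation.
    """
    P = counts / counts.sum()
    r = np.full(P.shape[0], 1.0 / P.shape[0])  # target row marginal
    c = np.full(P.shape[1], 1.0 / P.shape[1])  # target column marginal
    for _ in range(n_iter):
        P *= (r / P.sum(axis=1))[:, None]   # match row marginal
        P *= (c / P.sum(axis=0))[None, :]   # match column marginal
    return P
```

For strictly positive count matrices the alternating rescaling converges to a balanced distribution; the paper's variance-reduction bound concerns estimators built from such balanced weights.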
Meta-Dependence in Conditional Independence Testing
Mazaheri, Bijan, Zhang, Jiaqi, Uhler, Caroline
Constraint-based causal discovery algorithms utilize many statistical tests for conditional independence to uncover networks of causal dependencies. These approaches to causal discovery rely on an assumed correspondence between the graphical properties of a causal structure and the conditional independence properties of observed variables, known as the causal Markov condition and faithfulness. Finite data yields an empirical distribution that is "close" to the actual distribution. Across these many possible empirical distributions, the correspondence to the graphical properties can break down for different conditional independencies, and multiple violations can occur at the same time. We study this "meta-dependence" between conditional independence properties using the following geometric intuition: each conditional independence property constrains the space of possible joint distributions to a manifold. The "meta-dependence" between conditional independences is informed by the position of these manifolds relative to the true probability distribution. We provide a simple-to-compute measure of this meta-dependence using information projections and consolidate our findings empirically using both synthetic and real-world data.
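As a toy instance of projecting onto a conditional-independence manifold: for two discrete variables, the reverse-KL projection of a joint distribution onto the independence family is the product of its marginals, and the divergence to that projection is the mutual information. The sketch below illustrates only this textbook special case; names are illustrative and the paper's meta-dependence measure is not reproduced.

```python
import numpy as np

def project_onto_independence(p_xy):
    """Project a discrete joint p(x, y) onto the independence family
    {q : q(x, y) = q(x) q(y)} in the reverse-KL sense: the minimizer of
    KL(p || q) is the product of p's marginals, and the attained
    divergence equals the mutual information I(X; Y)."""
    px = p_xy.sum(axis=1, keepdims=True)
    py = p_xy.sum(axis=0, keepdims=True)
    q = px * py
    mask = p_xy > 0
    mi = np.sum(p_xy[mask] * np.log(p_xy[mask] / q[mask]))
    return q, mi
```

A distribution far from the manifold (large divergence) makes the corresponding independence test easy to reject; distances to several such manifolds at once are what drive the meta-dependence picture.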
On Prior Distributions and Approximate Inference for Structured Variables
Oluwasanmi O. Koyejo, Rajiv Khanna, Joydeep Ghosh, Russell Poldrack
We present a general framework for constructing prior distributions with structured variables. The prior is defined as the information projection of a base distribution onto distributions supported on the constraint set of interest. In cases where this projection is intractable, we propose a family of parameterized approximations indexed by subsets of the domain. We further analyze the special case of sparse structure. While the optimal prior is intractable in general, we show that approximate inference using convex subsets is tractable, and is equivalent to maximizing a submodular function subject to cardinality constraints. As a result, inference using greedy forward selection provably achieves within a factor of (1-1/e) of the optimal objective value. Our work is motivated by the predictive modeling of high-dimensional functional neuroimaging data. For this task, we employ the Gaussian base distribution induced by local partial correlations and consider the design of priors to capture the domain knowledge of sparse support. Experimental results on simulated data and high dimensional neuroimaging data show the effectiveness of our approach in terms of support recovery and predictive accuracy.
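The greedy forward selection with the (1 - 1/e) guarantee can be sketched generically. Here `f` is any monotone submodular set function supplied by the caller; the paper's actual objective, derived from the information projection, is not reproduced.

```python
def greedy_max(f, ground, k):
    """Greedy forward selection for a monotone submodular set function f
    under a cardinality constraint |S| <= k; the classic result of
    Nemhauser, Wolsey, and Fisher guarantees f(S) >= (1 - 1/e) * OPT.
    """
    S = set()
    for _ in range(k):
        # pick the element with the largest marginal gain f(S + x) - f(S)
        best = max((x for x in ground if x not in S),
                   key=lambda x: f(S | {x}) - f(S))
        S.add(best)
    return S
```

For example, with a coverage objective (size of the union of chosen subsets, which is monotone submodular) the loop greedily adds the subset covering the most new elements at each step.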
The Benefits of Balance: From Information Projections to Variance Reduction
Liu, Lang, Mehta, Ronak, Pal, Soumik, Harchaoui, Zaid
Deep neural networks have shown remarkable success at learning task-specific representations of data when provided supervision from massive amounts of labeled training examples. Recent trends, however, have shifted toward task-agnostic, universal representations that may be easily fine-tuned or even have zero-shot capabilities out of the box. Supervised learning, stricto sensu, is too limited a framework for these billion-parameter, data-hungry models, and a question at the heart of modern machine learning is learning from unlabeled, partially labeled, or weakly labeled data. This need has paved the way for the current generation of self-supervised learning (SSL) approaches that circumvent the need for large amounts of strong labels. In SSL, a model is trained on a generic pseudo-task that can be performed on unlabeled data, such as relating the two modalities of an image-caption pair or two augmentations of the same image. Despite several modern foundation models such as DINO (Caron et al., 2021; Oquab et al., 2024) and CLIP (Radford et al., 2021) being trained in this fashion, many aspects of SSL remain baffling. In particular, the training process of self-supervised models often outgrows and "breaks the rules" of the standard empirical risk minimization (ERM) toolkit. ERM combines two well-understood techniques: minibatch sampling and gradient-based optimization using backpropagation. SSL, on the other hand, adds clever, less-understood techniques to the training pipeline.